Structure-Preserving Pipelines for Digital Libraries

Authors

  • Massimo Poesio
  • Eduard Barbu
  • Egon Stemle
  • Christian Girardi
Abstract

Most existing HLT pipelines assume that the input is pure text or, at most, HTML, and either ignore the (logical) document structure or remove it. We argue that identifying the structure of documents is essential in digital libraries and other types of applications, and show that it is relatively straightforward to extend existing pipelines so that the structure of a document is preserved.

Similar resources

Design and Evaluation Indicators for Digital Libraries

Introduction: There has always been uncertainty about the concept and frameworks of digital libraries. Terms such as electronic library, virtual library, library without walls, hybrid library, and digital library have often been used together, or interchangeably, to convey the library concept. Studies have shown that so far there is no standard, universally accepted definition of digital libraries, howe...

Proposed content framework for digital literacy education to users in Iran

Aim: Today, digital literacy, as a set of skills that enable people to use digital space effectively for success in personal, educational, and professional life, has become a necessity in all societies, and public libraries are among the most important providers of digital literacy education in the world. Digital literacy education has not been considered in public libraries in Iran. The first s...

A Systematic Review of Data Mining Applications in Digital Libraries

Purpose: This study aimed to identify the applications of data mining in the provision of services, collection, and management of digital libraries. Methodology: This is an applied study in terms of purpose and, in terms of method, qualitative research conducted through a systematic review. For this purpose, articles were obtained by searching the databases of Springer, Emerald, ProQuest,...

Critical Success Factors of Digital Libraries in Iran: A Qualitative Research

Background and Aim: A myriad of IT projects have failed in recent years. Digital libraries (DLs), as the product of the use of IT in the library organization, have followed a similar trend. This paper studies the critical success factors (CSFs) of DLs in the context of Iran, with a special focus on the Iranian Ministry of Science, Research, and Technology. CSFs, in this paper, are those factors that if follo...

Assessing Compliance with User Interface Evaluation Criteria in the Farsi Web Pages of Self-Made and Purchased Digital Libraries in Iran

Purpose: In digital libraries, interaction between user and system is among the major issues in using library software. Therefore, finding appropriate software for this purpose is of high importance. This study aims to evaluate and analyze the criteria related to user interface in Farsi web pages of self-made and purchased digital libraries in Iran. Methodology: This is an applied and eva...

Publication date: 2011